The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive for participation (70%), while the prospect of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
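Imaging markers of cerebral small vessel disease provide valuable information on brain health, but their manual assessment is time-consuming and hampered by substantial intra- and inter-rater variability. Automated rating may benefit biomedical research as well as clinical assessment, but the diagnostic reliability of existing algorithms is unknown. Here, we present the results of the VAscular Lesions DetectiOn and Segmentation (Where is VALDO?) challenge, which was run as a satellite event at the international conference on Medical Image Computing and Computer Aided Intervention (MICCAI) 2021. This challenge aimed to promote the development of methods for the automated detection and segmentation of small and sparse imaging markers of cerebral small vessel disease, namely enlarged perivascular spaces (EPVS) (Task 1), cerebral microbleeds (Task 2), and lacunes of presumed vascular origin (Task 3), while leveraging weak and noisy labels. Overall, 12 teams participated in the challenge, proposing solutions for one or more tasks (4 for Task 1 - EPVS, 9 for Task 2 - Microbleeds, and 6 for Task 3 - Lacunes). Multi-cohort data was used in both training and evaluation. Results showed a large variability in performance both across teams and across tasks, with promising results notably for Task 1 - EPVS and Task 2 - Microbleeds, and not yet practically useful results for Task 3 - Lacunes. The challenge also highlighted performance inconsistency across cases that may deter use at the individual level, while the methods still proved useful at the population level.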
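The growth of unruptured intracranial aneurysms (UIAs) is a predictor of rupture. Therefore, for further imaging surveillance and treatment planning, it is important to be able to predict whether a UIA will grow based on an initial baseline time-of-flight MRA (TOF-MRA). The size and shape of UIAs are known predictors of aneurysm growth and/or rupture. We perform a feasibility study of using a mesh convolutional neural network to predict future UIA growth from baseline TOF-MRAs. We include 151 TOF-MRAs with 169 UIAs, of which 49 were classified as growing and 120 as stable, based on the clinical definition of growth (an increase in size of more than 1 mm on the follow-up scan). UIAs were segmented from the TOF-MRAs and meshes were generated automatically. We investigate two kinds of input: meshes of the UIA only, and region-of-interest (ROI) meshes that include the UIA and the surrounding parent vessels. We develop a classification model to predict which UIAs will grow and which will remain stable. The model consists of a mesh convolutional neural network with additional novel input edge features, shape index and curvedness, which describe the surface topology. We also investigated whether input edge mid-point coordinates influence model performance. The model with the highest AUC for growth prediction (63.8%) used UIA meshes with input edge mid-point coordinate features (average F1 score = 62.3%, accuracy = 66.9%, sensitivity = 57.3%, specificity = 70.8%). We present a future UIA growth prediction model based on a mesh convolutional neural network, with promising results.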
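We propose the Generalized Probabilistic U-Net, which extends the Probabilistic U-Net by allowing more general forms of the Gaussian distribution as the latent space distribution, so that the uncertainty in the reference segmentations can be approximated better. We study the effect of the choice of latent space distribution on capturing the uncertainty in the reference segmentations using the LIDC-IDRI dataset. We show that the choice of distribution affects the sample diversity of the predictions and their overlap with the reference segmentations. For the LIDC-IDRI dataset, we show that using a mixture of Gaussians results in a statistically significant improvement in the generalized energy distance (GED) metric with respect to the standard Probabilistic U-Net. We have made our implementation available at https://github.com/ishaanb92/generalizedprobabilisticunet.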
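The abstract above reports results in terms of the GED metric without defining it. As a rough illustration only, the sketch below assumes the definition commonly used with the Probabilistic U-Net, GED² = 2·E[d(S, Y)] − E[d(S, S′)] − E[d(Y, Y′)] with d = 1 − IoU, where S and S′ are predicted samples and Y and Y′ are reference segmentations. The function names and the set-of-pixels mask representation are assumptions made for this example and are not taken from the linked repository.

```python
from itertools import product


def iou_distance(mask_a, mask_b):
    """Return 1 - IoU for two binary masks given as sets of foreground pixel coordinates."""
    union = mask_a | mask_b
    if not union:
        # Assumption: two empty masks count as identical (distance 0).
        return 0.0
    return 1.0 - len(mask_a & mask_b) / len(union)


def mean_distance(masks_x, masks_y):
    """Average the IoU distance over all pairs, including self-pairs when both lists are the same."""
    total = sum(iou_distance(x, y) for x, y in product(masks_x, masks_y))
    return total / (len(masks_x) * len(masks_y))


def squared_ged(samples, references):
    """Squared generalized energy distance between predicted samples and reference masks."""
    return (
        2.0 * mean_distance(samples, references)
        - mean_distance(samples, samples)
        - mean_distance(references, references)
    )


# Hypothetical toy masks: two annotators disagree on one pixel.
references = [{(0, 0), (0, 1)}, {(0, 0)}]
matching_samples = [{(0, 0), (0, 1)}, {(0, 0)}]
print(squared_ged(matching_samples, references))
```

Under this definition, lower values indicate that the predicted samples match the reference segmentations more closely in both overlap and diversity; samples that reproduce the reference set exactly give a value of zero.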
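Deep learning techniques have been successful at detecting objects in medical images, but they still suffer from false-positive predictions that may hinder accurate diagnosis. The estimated uncertainty of the neural network output has been used to flag incorrect predictions. We study the role played by features computed from neural network uncertainty estimates and by shape-based features computed from binary predictions in reducing false positives in liver lesion detection, by developing a classification-based post-processing step for different uncertainty estimation methods. We demonstrate an improvement in the lesion detection performance of the neural network (with respect to the F1 score) for all uncertainty estimation methods on two datasets, comprising abdominal MR and CT images, respectively. We show that features computed from neural network uncertainty estimates tend not to contribute much to reducing false positives. Our results show that factors such as class imbalance (the ratio of true to false positives) and shape-based features extracted from uncertainty maps play an important role in distinguishing false-positive from true-positive predictions.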
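In medical imaging, obtaining large amounts of labeled data is often a hurdle, because annotations and pathologies are scarce. Anomaly detection is a method capable of detecting unseen abnormal data while being trained only on normal (unannotated) data. Several algorithms based on generative adversarial networks (GANs) exist to perform this task, but they have certain limitations because of the instability of GANs. This paper proposes a new method that combines an existing method, GANomaly, with progressively growing GANs. The latter is known to be more stable, considering its ability to generate high-resolution images. The method is tested on Fashion MNIST, the Medical Out-of-Distribution Analysis Challenge (MOOD), and in-house brain MRI, using patches of size 16x16 and 32x32. Progressive GANomaly outperforms a one-class SVM and regular GANomaly on Fashion MNIST. Artificial anomalies with varying intensities and diameters are created in MOOD images. Progressive GANomaly detected the most anomalies across the varying intensities and sizes. Additionally, the intermediate reconstructions from progressive GANomaly are shown to be better. On the in-house brain MRI dataset, regular GANomaly outperformed the other methods.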
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
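We introduce ChebLieNet, a group-equivariant method on (anisotropic) manifolds. Building on the success of graph-based and group-based neural networks, we take advantage of recent developments in geometric deep learning to derive a new approach that exploits any anisotropies in the data. Via discrete approximations of Lie groups, we develop a graph neural network made of anisotropic convolutional layers (Chebyshev convolutions), spatial pooling and unpooling layers, and global pooling layers. Group equivariance is achieved through equivariant and invariant operators on graphs whose edges encode affinities based on anisotropic left-invariant Riemannian distances. Thanks to its simple form, the Riemannian metric can model any anisotropy, in both the spatial and the orientation domains. This control over the anisotropy of the Riemannian metric makes it possible to balance equivariance (anisotropic metric) against invariance (isotropic metric) in the graph convolution layers. We thereby open the door to a better understanding of anisotropic properties. Furthermore, we empirically demonstrate the existence of (data-dependent) sweet spots for the anisotropy parameters on CIFAR10. This crucial result is evidence of the benefit that can be gained by exploiting anisotropic properties in data. We also evaluate the scalability of this approach on STL10 (image data) and ClimateNet (spherical data), showing its remarkable adaptability to diverse tasks.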
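The Chebyshev convolutions mentioned above rest on a polynomial recurrence that is easy to state in code. The sketch below is a generic, single-channel illustration with scalar weights, assuming a rescaled graph Laplacian whose eigenvalues lie in [-1, 1]; it does not reproduce the Lie-group discretization or the anisotropic Riemannian edge affinities described in the abstract, and the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def chebyshev_filter(laplacian, signal, weights):
    """Compute sum_k weights[k] * T_k(L) x using the Chebyshev recurrence.

    T_0(L) x = x, T_1(L) x = L x, T_k(L) x = 2 L T_{k-1}(L) x - T_{k-2}(L) x.
    Assumes `laplacian` is already rescaled so its spectrum lies in [-1, 1].
    """
    if not weights:
        raise ValueError("at least one weight is required")
    t_prev = list(signal)  # T_0(L) x
    output = [weights[0] * v for v in t_prev]
    if len(weights) == 1:
        return output
    t_curr = matvec(laplacian, signal)  # T_1(L) x
    output = [o + weights[1] * v for o, v in zip(output, t_curr)]
    for weight in weights[2:]:
        # Recurrence step: only matrix-vector products, no eigendecomposition.
        t_next = [2.0 * a - b for a, b in zip(matvec(laplacian, t_curr), t_prev)]
        output = [o + weight * v for o, v in zip(output, t_next)]
        t_prev, t_curr = t_curr, t_next
    return output


# Hypothetical two-node graph: rescaled normalized Laplacian L - I.
rescaled_laplacian = [[0.0, -1.0], [-1.0, 0.0]]
print(chebyshev_filter(rescaled_laplacian, [1.0, 0.0], [1.0, 1.0, 1.0]))
```

A filter of order K only mixes information from nodes at most K hops apart, which is why the choice of edge affinities (isotropic or anisotropic) directly shapes what each layer can see.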
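Closed-loop brain stimulation refers to capturing neurophysiological measures such as electroencephalography (EEG), quickly identifying neural events of interest, and producing auditory, magnetic, or electrical stimulation so as to interact precisely with brain processes. It is a new method for fundamental neuroscience and perhaps for clinical applications such as restoring degraded memory function; however, existing tools are expensive, cumbersome, and offer limited experimental flexibility. In this article, we propose the Portiloop, a deep learning-based, portable, and low-cost closed-loop stimulation system able to target specific brain oscillations. We first document open-hardware implementations that can be built from commercially available components. We also provide a fast, lightweight neural network model and an exploration algorithm that automatically optimizes the model hyperparameters for the desired brain oscillation. Finally, we validate the technology on the challenging test case of real-time sleep spindle detection, with results comparable to offline expert performance on the Massive Online Data Annotation spindle dataset (MODA; group consensus). Software and plans are available to the community as an open science initiative to encourage further development and advance closed-loop neuroscience research.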
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
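The comparison against "predicting the most common outcome" can be made concrete with a small sketch. The labels and predictions below are hypothetical toy values rather than data from the study, and the helper names are illustrative assumptions.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    most_common_label, count = Counter(labels).most_common(1)[0]
    return most_common_label, count / len(labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical ground truth: True means the bill is relevant to the company.
labels = [False] * 7 + [True] * 3
# Hypothetical model output that gets one relevant bill wrong.
predictions = [False] * 7 + [True, True, False]

baseline_label, baseline_accuracy = majority_baseline(labels)
print(baseline_label, baseline_accuracy, accuracy(predictions, labels))
```

When most bills are irrelevant to a given company, this baseline is already high, so a model adds value on this measure only if its accuracy exceeds it.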